Multistage coarticulation model combining articulatory, formant and cepstral features

Authors

  • Yuqing Gao
  • Raimo Bakis
  • Jing Huang
  • Bing Xiang
Abstract

We describe a multi-stage speech production model containing a linear, phoneme-independent coarticulation filter, followed by a nonlinear component. The latter generates two cepstra which are then additively combined: one corresponding to a relatively smooth background spectrum, and the other representing three formant-like spectral peaks. A neural net is used for both parts, but the second part also utilizes a hard-coded function that generates exactly three spectral peaks. A unified model of training, adaptation, and decoding is developed, each operation differing only with respect to prior probability distributions. Prior probabilities can be introduced at each stage of the model, providing a flexible framework for utilizing both specific and general prior knowledge. We demonstrate the use of this model for speech synthesis as well as recognition.
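The staged structure described in the abstract can be pictured with a minimal sketch. Everything below is an illustrative assumption rather than the authors' implementation: the function names, the smoothing kernel, the one-layer tanh stand-in for the neural net, the resonance shape, and the mapping from state values to formant frequencies are all hypothetical. It only shows the data flow: one linear filter shared by all phonemes, then two cepstra that are added together.

```python
import math

def coarticulate(targets, kernel):
    """Linear, phoneme-independent stage: one shared kernel smooths every target track."""
    half, n = len(kernel) // 2, len(targets)
    out = []
    for t in range(n):
        frame = [0.0] * len(targets[0])
        for k, w in enumerate(kernel):
            src = min(max(t + k - half, 0), n - 1)  # clamp at utterance edges
            for d, v in enumerate(targets[src]):
                frame[d] += w * v
        out.append(frame)
    return out

def log_spectrum_to_cepstrum(log_spec, n_cep):
    """Cosine transform (DCT-II) of a log spectrum sampled on a uniform grid."""
    n = len(log_spec)
    return [sum(s * math.cos(math.pi * q * (i + 0.5) / n)
                for i, s in enumerate(log_spec)) / n
            for q in range(n_cep)]

def three_peak_log_spectrum(formants, bandwidths, n_bins=64, f_max=4000.0):
    """Hard-coded function (assumed shape): sum of exactly three resonance terms."""
    assert len(formants) == 3 and len(bandwidths) == 3
    spec = []
    for i in range(n_bins):
        f = (i + 0.5) * f_max / n_bins
        power = sum(1.0 / (1.0 + ((f - fc) / (bw / 2.0)) ** 2)
                    for fc, bw in zip(formants, bandwidths))
        spec.append(math.log(1e-3 + power))
    return spec

def background_cepstrum(state, weights):
    """Toy stand-in for the neural net: a single tanh layer with given weights."""
    return [math.tanh(sum(w * x for w, x in zip(row, state))) for row in weights]

def synthesize_frame(state, weights, n_cep):
    # Hypothetical mapping from the first three state values to formants in Hz.
    formants = [500.0 + 300.0 * state[0],
                1500.0 + 500.0 * state[1],
                2500.0 + 500.0 * state[2]]
    peaks = log_spectrum_to_cepstrum(
        three_peak_log_spectrum(formants, [80.0, 120.0, 160.0]), n_cep)
    smooth = background_cepstrum(state, weights)
    return [a + b for a, b in zip(smooth, peaks)]  # additive combination

# Two made-up phoneme targets, three frames each.
targets = [[0.0, 0.0, 0.0]] * 3 + [[1.0, -1.0, 0.5]] * 3
states = coarticulate(targets, [0.25, 0.5, 0.25])
weights = [[0.1, 0.0, -0.1]] * 4  # one row per cepstral coefficient
cepstra = [synthesize_frame(s, weights, 4) for s in states]
```

In this sketch the smoothing in `coarticulate` is what makes frames near the phoneme boundary blend the two targets. The prior probabilities that the abstract attaches to each stage are not modeled here.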


Similar resources

Vocal tract inversion by cepstral analysis-by-synthesis using chain matrices

Acoustic-to-articulatory inversion for vowels is performed by cepstral analysis-by-synthesis, using chain-matrix calculation of vocal tract (VT) acoustics and the Maeda articulatory model. The derivative of the VT chain matrix with respect to the area function was calculated in a novel efficient manner, and used in the BFGS quasi-Newton method for optimizing a distance measure between input and...


Formant trajectories for acoustic-to-articulatory inversion

This work examines the utility of formant frequencies and their energies in acoustic-to-articulatory inversion. For this purpose, formant frequencies and formant spectral amplitudes are automatically estimated from audio, and are treated as observations for the purpose of estimating electromagnetic articulography (EMA) coil positions. A mixture Gaussian regression model with mel-frequency cepst...


The Effect of Stress and Speech Rate on Vowel Coarticulation in Catalan Vowel-Consonant-Vowel Sequences.

PURPOSE: The goal of this study was to ascertain the effect of changes in stress and speech rate on vowel coarticulation in vowel-consonant-vowel sequences. METHOD: Data on second formant coarticulatory effects as a function of changing /i/ versus /a/ were collected for five Catalan speakers' productions of vowel-consonant-vowel sequences with the fixed vowels /i/ and /a/ and consonants: the ap...


Adaptation of cepstral coefficients for acoustic-to-articulatory inversion

Acoustic-to-articulatory inversion of speech signals via an analysis-by-synthesis method requires the comparison of natural and synthetic speech spectra either indirectly via formant frequencies, or directly via cepstral coefficients. This paper investigates several strategies of cepstral adaptation (affine transformation of cepstral coefficients, bilinear or piecewise linear frequency warping) ...


Consonant context effects on vowel sensorimotor adaptation

Speech sensorimotor adaptation is the short-term learning of modified articulator movements evoked through sensory feedback perturbations. A common experimental method manipulates acoustic parameters, such as formant frequencies, using real-time resynthesis of the participant's speech to perturb auditory feedback. While some studies have examined phrases comprised of vowels, diphthongs, and semi...




Publication date: 2000